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This paper presents a new method for computing the parameters which 
determine the differential equations governing a linear time-invariant 
system with multiple inputs and outputs. Unlike earlier approaches the 
method presented does not involve computation of the impulse response. 
One of the main advantages of this method is its easy generalization to the 
case when the given data is contaminated with noise. 

The identification of multiple input-output linear systems has been 
a problem of considerable interest because of its importance in circuit 
and control system theory. In circuit theory the problem is that of 
synthesizing a linear time invariant circuit to exhibit a prescribed 
input-output behavior. In control theory, however, the problem arises 
out of a need to model a given linear system with a suitable set of 
differential equations, given its input-output behavior. References 1, 
2, and 3 deal with the problem of determining the parameters of the 
differential equation model from the impulse response. To the best of 
the author's knowledge, there is no published method which deter- 
mines the impulse response from a finite segment of input-output 
data in the case of systems with more than one input and output. 

1101 



1102 THE BELL SYSTEM TECHNICAL JOURNAL, MAY-JUNE 1969 

I. THE "STATE INVARIANT" DESCRIPTION 

In most applications of identification techniques, one is given only 
a record of input sequences and a record of output sequences, rather 
than the impulse response function. In these cases it seems best to 
get an internal description of the system directly from these data; 
that is, avoid the intermediate step of synthesizing the impulse re- 
sponse. In many applications the structure of systems that are being 
identified remains the same, while values of parameters change. 
Therefore, it is convenient to work in a certain coordinate frame which 
is fixed for the given system. Most important of all, a method of arriv- 
ing at the values of parameters directly from input-output data is 
easier to analyze than the method in which impulse response is syn- 
thesized, since the sensitivity of intermediate computations re- 
quired to obtain the impulse response matrix need not be analyzed. 

The problem is therefore formulated as follows. Let 2 be a linear 
system in discrete time modeled by equations (1) and (2) : 

x(s + 1) = Fx(s) + Gu(s) (1) 

y(fi) = Hx(s). (2) 

x(s) e E n (the "n" dimensional Euclidean space) is the state of 2 at 
time s; similarly u{s) and y(s) are the m-dimensional input and the 
p-dimensional output of 2. F, G, H are real constant matrices of ap- 
propriate dimensions. 2 is assumed to be completely reachable and 
completely observable (for details about these terms see Ref. 4), 
namely 

rank of [G, FG, ■ • • , F^G] = n (3) 

and 

rank of [H' } F'H', • • • , F'^H'] = n (4) 

where prime (') denotes the transpose. Given a sequence of inputs u(s) 
and outputs y(s) for s = 1, 2, • • • , N (where N is sufficiently large), 
find a system D of the same dimension as 2 namely n such that 2 
simulates the input-output behavior of 2. 

Remark 1: It is clear that there are some sequences u(s) which will 
not be sufficient to uniquely specify S. Theorems, presented in Section 
II, give sufficient conditions for u(s) and N which uniquely determine S. 
Remark 2: When S is uniquely determined it will be shown that the 
state of S is uniquely related to the state of 2. In fact the F, (j, and 
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6 of t will be related to the F, G, and H of 2 by a nonsingular trans- 
formation such that HF'Q = ftP'G which implies that the impulse 
responses of 2 and 2 are identical. Notice that for any nonsingular T 

ft = HT~ l F = TFT- 1 G = TG 

implies that 

HF'G = HF f d. 

The main difficulty in obtaining a direct algorithm is in getting at the 
state x{s) from output sequences when the parameters of the system 
are not known. When, for example, H in the equation below is identity, 
or equivalently the output itself is the state, it is easy to find an in- 
ternal description from sequences of inputs and outputs. From writing 
this equation as 



x(s + 1) = [F G\ 



X(8) 

lu(s). 



y(s) = Hx(s) = x(s), 

it follows that given enough observations one can solve for F and G 
from the above equation for most nontrivial input sequences (see 
Theorem 2). An easy way is to multiply both sides of this equation 
by [.t/(s) u'(s) ] and sum from s = 1 to s = N where N is the number 
of observations: 



w 



£ {X(S + 1)[X'(8) 

(-1 



u'(s)]\ = [F G] E 



x(s) 
Lw(s). 



[*'(«) «'(«)] 



Whenever the matrix multiplying [F G] in the above equation has 
an inverse, there exists a unique solution for F and G. 

In the case when y(s) is not the state itself but only a linear 
function of the state, the problem is much more complex and one 
has to select certain appropriate components of the output sequence 
for an external description in terms of the observables, namely y{i) 
and u{i). The selection of the right components can be done by in- 
troducing an operator to be called the selector matrix as defined below. 

In describing the theory of the direct identification method, con- 
siderable use is made of the input-output description to be detailed 
below. 

Definition: S will denote the set of k X I matrices (fc ^ I) with the 
following properties: 
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(i) S = {sn} where s if = or 1. 
(it) V ,• , s,, = 1 for one and only one ;', say ;',• . 
(Hi) ;, < j 2 < • • • < ;* , ji ^ I, i ^ k. 
Examples of matrices belonging to S are 



(5) 
(6) 

(7) 



[1] 



1 
1 



1 
1 



and so on . 



Any matrix S e S will be referred to as a selector matrix, because S 
operating on a linear space E' transforms it into a linear space E k by 
mapping every vector x t E l to a vector y t E k by selecting the com- 
ponents j, , • • • , it of x e E l . 

The description presented is an "external" description in the sense 
that the dynamical equations are given in terms of quantities which 
can be observed from outside, that is, values of input and values of 
output. 

Consider a completely reachable and completely observable discrete 
time system 5 represented as follows 

x(s + 1) = Fx(s) + Gu(s), (8) 

y(s) = Hx(s), H: pXn; F: n X n; G: n X m. (9) 

"Completely observable" implies* 

p([F' : H']) = n. (10) 

p(A) = rank of A. Therefore, 3 an S e S, such that 

r h "1 

S = T where T is nonsingular; (11) 

that is, jT _1 exists. Without loss of generality it can be assumed from 
remark 2 that T = I so far as the external description is concerned. 
Using equations (8) and (9) repeatedly, it follows that 

y(s) - Hx(s), 

y(s + 1) - #z(s + 1) = HFx(s) + #(?w(s) (12) 

?/(s + n - 1) = HF"-'x(s) + HF"- 2 Gu(s) + • • • + #(?u(s + n - 2). 
Let 



* [F* : m a r#', F'if, • • • , j*6"-«>in. 
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y'(s) = [y'(s) y'(s + 1) • • • y'(s + n - 1)], (13) 

u'(s) = [u'(s) u'(s + 1) • •• u'(s + n - 1)]. (14) 

Then, writing equation (12) in vector form, and also using equations 

(13) and (14), it follows that 



m = 



// 

HF 
HF"- 



*(«) + 




HG 



HF"- 2 G HF-'G 



u(s). (15) 



Lett 












HG 








HFG 


HG 







HFG 


HG 









ol 





























HG 








HFG 


HG 


0. 



= #1 ; (16) 



HF n ~ 3 G HF n ~ 4 G HF n ~ 5 G 

lHF n ~ 2 G HF"- 3 G HF n - A G 
then multiplying both sides of equation (15) by S, using the comments 
given below equation (11) , it follows that 

Sy(s) = x(s) + SRMs). (17) 

Once again, using equation (9) , 

x(s + 1) = Fx(s) + Gu(s), 
which because of equation (17), with s replaced by s + 1, reduces to 

x(s + 1) = Sy(s + 1) - SRMs + 1); (18) 

substituting equation (9) for x{s) in equation (17) gives 

H 
HF 



Sy(s + 1) = F(Sy(s) - SR^s)) + S 



HF"- 1 . 



Gu(s) + SRMs+ 1). 
(19) 



t The last column of zeroes in Si is added so that y and u may be consistently 
defined. 
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Since it has been shown that [see equation (11) ] 



S 



H 
HF 

HF n ~ 



= I, 



(20) 



it follows that 



where:}: 



Sy(s + 1) = FSy(s) + Ru(s), 



(21) 



R = -FSR, + £ 



HG 
HFG 
HF 2 G 











HG 
HFG 
• 



(22) 



HF"~ l G HF n ~ 2 G • • HG. 

Equation (21) gives a relation between the input sequence u{i) and 
the output sequence y(i) which does not involve the state. It is an 
external description in the sense that the variables in equation (21), 
namely u(i) and y(i), can be measured externally. From equation 
(22) it follows that if R is partitioned as 



[R R l • • • #„_,], Rt : n X m V i, 



then 



Rn-X = S 



HG 



(23) 



(24) 



It is obvious how one obtains the columns of the second product 
in equation (22). To obtain the contribution from —FSRi, notice from 
equation (16) that S times the second column from the end of Ri 
is, from equation (24), merely Rn-i- Therefore, the second column of 
FSRi from the end is simply ^jRn-i and therefore 



t In adding SRiiKs + 1) to the second term in equation (20), the last column 
of Ri may be dropped because it is all zeroes. 
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/,'„ 2 = -FR n - x + S • (25) 

HG 

HFG. 

Now notice that R„- 2 + FR^ is S times the third column from the 
end of Ri. Therefore the third column from the end of R is 



tf n _ 3 = -FRn-2 - FX-t + 5 



HG 
HFG 
HF 2 G 



Continuing in the same way, 



/e (i _ 4 = -FR n .a - F 2 R„- 2 - FX-. + 8 







HG 
HFG 
HF*G 
HF*G 



and finally 



S 



HG 
HFG 



= /?„ + FR X + ■■■ + F" '/?„_, . 



HF n ~ l G_ 
Now, since it was possible to choose a basis such that 



HG 



S 



= IG, 



(26) 



HF"~ l G. 



1J08 
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one has 

G = R + FR, + • • • F"" 1 /?,.-, • 
Equation (21) may be written in the form 



(27) 



Sy(s + 1) = [F R] 



_u(s) 



and may in principle be solved for F and R. Thus from equations 
(27) it is clear that if the values for u(i) and y(i) were given and 
S were known, one could also solve for one set of values for F; and 
since in most cases H is full rank, H can be assumed to be [7 0]. 



II. THE MINIMAL REPRESENTATION AND THE DIRECT ALGORITHM. 

It was shown in Section I that, corresponding to every internal 
description of 2 which is completely controllable and completely ob- 
servable, there is a description in the form of equation (21). In this 
section we show that from the knowledge of the values of u(i), i = 1, 
• • • , N, and y(i), i = 1, • ■ • , N, one can get the internal description 
of 2 under very general conditions on u(i). Central to the discussion 
are a few results which are presented in the form of theorems for the 
sake of clarity and precision. 

Given u(i) , i = 1, • • • , N, the inputs to a system 2 of dimension n 
which is completely observable and completely reachable, and the 
corresponding outputs y(i),i = 1, • • • , N, the following propositions 
hold true: 

Note 1 : It will be assumed in the following that the column dimension 
k of the selector matrix is always a multiple of p; further if k = rp, 
then 

(r — l)p ^ ji ^ rp. 

It is obvious that there is no loss of generality involved in this assump- 
tion. (I is the row dimension of S.) 

Note 2: In the definition of y(s) and u(s) in equations (13) and (14), 
the n should be replaced by r defined in Note 1 above. 
Theorem 1 : Let S be 1 X k (= rp) ; then 

H 



HF 



HF r_1 _ 



< 1 



(28) 
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Sy(s) 
LG(s) r J 



[y'(s)S' u'(s)] is a singular matrix 
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(29) 



for every sequence u(i), i = 1, 2, ••• , N. 

Proof: Multiplying equation (10) on the left by S and replacing n by 

r we have, 



Sy(s) = S 



H 

HF 



x(s) + SR&is). 



(30) 



HF" 1 . 

Because of equation (28) 3 a vector, z^O, and in E l such that 

H 
HF 



z'S 



HF r 



= 0. 



(31) 



Therefore, multiplying equation (30) on the left by z' gives 

z'Sy(s) = z'SRMs). (32) 

Therefore, 

= Vs ^ N -r + 1 (33) 



[y'(s)S' u'(s)] is singular. QED 



Theorem 2: If 2 zs completely observable and completely reachable, the 
matrices F, G, a??d H are n X n, n X m, and p X n respectively; then 
3 an S: n X np swc/i £/m£ 



\z' -z'SRJ 


- w(«) - 


which implies that 


N-r + l 




i(<oJ 






t^E 



Sy(s) 



1 _u(s)_ 



[y'(s)S' u(s)] > almost surely (34) 
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where u(i) are random variables having a joint nonlattice distribu- 
tion* 

Proof: The first step in the proof consists of establishing Lemma 1. 
Lemma 1 : T > if and only if 



x(s) 
Ln(s)J 



[x'(s) u(s)] > 0. 



(35) 



Proof of Lemma 1: If T > 0, 3 z' such that z' = [z[ z' 2 ] j* 0, and 
z[Sy(s) + z' 2 u{s) = Vs. (36) 

Since 2 is completely observable, multiplying equation (17) 

Sy(s) = x(s) + SRMs) (37) 

on the left by z[ one obtains 

z[Sy(s) = z[x{s) + z[SR<a(s). (38) 

Combining equations (36) and (38), 

z[x{s) + (z[SR 1 - z' 2 )u{s) =0 Vs, (39) 

and 

[zi ,z[SR 1 -z' 2 ]^Q; 

for if [z[ , z^SRi - z' 2 \ = 0, then [z[ z' 2 \ = 0, which contradicts 2^0. 
Therefore, 

x(s) 



E 

- 1 WOJ 



Now suppose T > 0. Let 



"y x(s) 

-' -t*(8)J 



[z'(s) u'(s)} > 0. 



[z'(s) u(s)] > 0. 



(40) 



Then 3 a z' = [z[ z' 2 ] ^ such that 

z[x{s) 4- z' 2 u{s) = Vs. 



(41) 



Again multiplying equation (37) by z[ and using equation (41), it 
follows that 



z[Sy(s) = -z' 2 u(s) + z{SRMs) 



(42) 



*A nonlattice distribution is one in which no nonzero probability mass is 
concentrated on a surface less than the dimension of the random variable. 
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(43) 



z[Sy(s) + (z' 2 - ziSRi) u(s) = 

Once again [z[ , (z' 2 - zfSfli)] ^ since z ^ 0, which contradicts 
T > 0. The proof of Lemma 2 will now complete the proof of Theorem 2. 

Lemma 2: If (i) 2 is completely controllable, (ii) u(i) are random 
variables with a joint nonlattice distribution, then the (n 4- nm) x 
(n + nm) matrix 



x(D 
Q(D 



x(n + nm) 

u(n_4- nm)J 



(44) 



is almost surely nonsingular. 

Proof: From Lemma A.2 in Appendix A of Ref. 5 it follows that if 



(45) 



z(s + 1) = F l z(s) + G l u(s), 

with Fi(n + nm) X (n + nm) , 

then [z(l), •• • , z{n + nm)] is nonsingular with probability one, if 
F x , Gi is completely controllable. Further, from equations (8) and the 
definition of u, it is clear that 





~x(s + 1)1 




> (? • • • 


o] 


x(s) 






["ol 




u(s + 1) 




I • • ■ 





u(s) 


+ 










• • • • 


/ 








_u(s + W). 




• • • • 


0_ 


u(s + n — 1)_ 




_7_ 




w (s -f- n) 


Therefore, identifying Fi and G x as 






[f e o ■ • 


• o" 




[ol 






7 • ■ 


• 


and 


.7. 








_0 ■ • 


• o. 





(46) 



respectively, equation (44) follows since it can easily be shown that 
[F G] controllable implies that [Fi d] is completely controllable. 
Lemma 2 implies that the matrix in equation (40) is positive definite 
since in general A nonsingular -> A T A > 0, which implies equation 
(34) by (i). 
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m. THE COMPUTATIONAL METHOD 

The main part of the algorithm, as would be expected from the dis- 
cussion in Section II, is to determine the right selector matrix. Once 
this has been done it is easy to solve for the parameters. In order to 
utilize certain properties of the matrix [F' : H], a class of matrices 
S C S is detailed below since S C S, the number of different selector 
matrices one has to try, is smaller than S. 
Definition: S is the set of matrices S t §> such that S is I X k; then 

(t) k is an integral multiple of p and further, if k = rp, then 

(?■ - \)p < j t £ rp 

where ;', is as defined in equation (5). 

(ii) ji - ji-i ^ 2p. 

Observe that by (i) there always exists an S e S such that equation (11) 
holds, since as can easily be proved, if 

p([H', F'H', ■•■ , F"H']) = p([H', F'H', • • • , F' lt+1) H']) = q 

then 

P ([H', F' H', F' it+i) H']) =q j = 0, 1, 2, • • • ; 

so that in spite of condition (ii) in the above definition, there exist 
an S t S such that equation (11) holds. 

(Hi) The formulas (21) and (27) are still valid for any S e S satisfying 
equation (11), with n replaced everywhere by r defined in condition (i) 
in the definition of S above. 

Now from Theorems 1 and 2 and the above discussion, the direct 
algorithm can be summarized as follows. 

It can be assumed without loss of generality that: (i) N ^ n + 
(m + l)n; that is, there is a sufficient number of observations to deter- 
mine the internal description uniquely, n is the minimal dimension of 
the system to be identified, (ii) H has full rank, (in) n ^ p. 
Step 1: Since n ^ N/(m + 2), let N be the largest integer 
^N/(m + 2). Then n ^ N. In order to arrive at the right S, one starts 
with an S t S of row dimension N and tests the nonsingularity of 



<m + l># 



Sy(s) 
. u(s) J 



[y'(s)S f u'(s)] 



for all S e S and having row dimension N. If T is nonsingular, N = n. 
If T is singular, then reduce the row dimension of S by 1 and repeat the 
test. Repeat the procedure until T becomes nonsingular. The row di- 
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mension of S will then be n; let r be as defined in condition (t) in the 
definition of S; that is, S is n X rp. 
Step 2: Solve for F, R as follows. 

[*" R] = \L M Sy(s + l)li?(s)iS' ^(s)]}^ 1 
Siep 5: Solve for G from the following formula. 

G = R + FRt + • • • + ^"'fir-i , 

where S:n X rp and it!< are the partitions of R such that 

R = [R fix • • • fl.-x] 

and Kj = nXmi = 0, ■ • • , r — 1. H can be assumed to be [I 0] where 
the identity has dimension p. 

In the case when 2 is a continuous-time system, the algorithm 
presented above applies with appropriate modifications. In the defi- 
nitions of y(s) and u(s), s now assumes values in (R and y(s + i) should 
be replaced by y li) (s) evaluated at s. The summation signs should be re- 
placed by integration over an interval. The formulas for the parameters 
become 

[F B] - f + ' Sy {f) (sW(s)S' «'(•)] A ST" 



T 



-r 



[y'(s)S f u'(s)]ds. 



~Sy(sj 

L u(s) J 

G can be obtained from R exactly as in the above algorithm for the 
discrete time case. 

In the case when observations are contaminated with noise, this 
method can be generalized to yield consistent estimates for the param- 
eters (see Ref. 5). 
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